Abstract. In this paper, we develop verifiable and computable performance analysis of sparsity
recovery. We define a family of goodness measures for arbitrary sensing matrices as a set of
optimization problems, and design algorithms with a theoretical global convergence guarantee to
compute these goodness measures. The proposed algorithms solve a series of second-order cone
programs or linear programs. As a by-product, we implement an efficient algorithm to verify a
sufficient condition for exact sparsity recovery in the noise-free case. We derive performance
bounds on the recovery errors in terms of these goodness measures. We also analytically
demonstrate that the developed goodness measures are non-degenerate for a large class of random
sensing matrices, as long as the number of measurements is relatively large. Numerical
experiments show that, compared with the restricted isometry based performance bounds, our
error bounds apply to a wider range of problems and are tighter when the sparsity levels of the
signals are relatively low.

1. Introduction.

Sparse signal recovery (or compressive sensing) has revolutionized the way we
think of signal sampling [11]. It goes far beyond sampling and has also been applied
to areas as diverse as medical imaging, remote sensing, radar, sensor arrays, image
processing, and computer vision. Mathematically, sparse signal recovery aims
to reconstruct a sparse signal, namely a signal with only a few non-zero components,
from usually noisy linear measurements:

$y = Ax + w,$ (1.1)

where $x \in \mathbb{R}^n$ is the sparse signal, $y \in \mathbb{R}^m$ is the measurement vector,
$A \in \mathbb{R}^{m \times n}$ is the sensing/measurement matrix, and $w \in \mathbb{R}^m$ is the noise.
A theoretically justified way to exploit the sparseness in recovering $x$ is to minimize
its $\ell_1$ norm under certain constraints [oj.

In this paper, we investigate the problem of using the $\ell_\infty$ norm as a performance
criterion for sparse signal recovery via $\ell_1$ minimization. Although the $\ell_2$ norm has been
used as the performance criterion by the majority of published research in sparse signal
recovery, the adoption of the $\ell_\infty$ norm is well justified. Other popular performance
criteria, such as the $\ell_1$ and $\ell_2$ norms of the error vectors, can all be expressed in
terms of the $\ell_\infty$ norm in a tight and non-trivial manner. More importantly, the $\ell_\infty$
norm of the error vector has a direct connection with the support recovery problem.
To see this, assume we know a priori the minimal non-zero absolute value of the
components of the sparse signal; then controlling the $\ell_\infty$ norm within half of that
value would guarantee exact recovery of the support. Support recovery is arguably
one of the most important and challenging problems in sparse signal recovery. In
practical applications, the support is usually physically more significant than the
component values. For example, in radar imaging using sparse signal recovery, the
sparsity constraints are usually imposed on the discretized time-frequency domain.
The distance and velocity of a target have a direct correspondence to the support of
the sparse signal. The magnitude determined by coefficients of reflection is of less
physical significance [1], [18], [19]. Refer to [27] for more discussions on sparse support
recovery.

Another, perhaps more important, reason to use the $\ell_\infty$ norm as a performance
criterion is the verifiability and computability of the resulting performance bounds.
A general strategy to study the performance of sparse signal recovery is to define a
measure of the goodness of the sensing matrix, and then derive performance bounds in
terms of the goodness measure. The most well-known goodness measure is undoubtedly
the restricted isometry constant (RIC) [7]. Upper bounds on the $\ell_2$ and $\ell_1$ norms
of the error vectors for various recovery algorithms have been expressed in terms of
the RIC. Unfortunately, it is extremely difficult to verify that the RIC of a specific
sensing matrix satisfies the conditions for the bounds to be valid, and even more
difficult to directly compute the RIC itself. Actually, the only known sensing matrices
with nice RICs are certain types of random matrices [20]. By using the $\ell_\infty$ norm
as a performance criterion, we develop a framework in which a family of goodness
measures for the sensing matrices are verifiable and computable. The computability
further justifies the connection of the $\ell_\infty$ norm with the support recovery problem,
since for the connection described in the previous paragraph to be practically useful,
we must be able to compute the error bounds on the $\ell_\infty$ norm.

The verifiability and computability open doors for wide applications. In many
practical applications of sparse signal recovery, e.g., radar imaging [24], sensor arrays
[22], DNA microarrays [25], and MRI [21], it is beneficial to know the performance
of the sensing system before its implementation and the taking of measurements.
In addition, in these application areas, we usually have the freedom to optimally
design the sensing matrix. For example, in MRI the sensing matrix is determined by
the sampling trajectory in the Fourier domain; in radar systems the optimal sensing
matrix design is connected with optimal waveform design, a central topic of radar
research. To optimally design the sensing matrix, we need to

1. analyze how the performance of recovering $x$ from $y$ is affected by $A$, and
define a function $\omega(A)$ to accurately quantify the goodness of $A$ in the context
of sparse signal reconstruction;

2. develop algorithms to efficiently verify that $\omega(A)$ satisfies the conditions for
the bounds to hold, as well as to efficiently compute $\omega(A)$ for arbitrarily given $A$;

3. design mechanisms to select within a matrix class the sensing matrix that is
optimal in the sense of best $\omega(A)$.

In this paper, we successfully address the first two points in the $\ell_\infty$ performance
analysis framework.

We now preview our contributions. First of all, we propose using the $\ell_\infty$ norm as
a performance criterion for sparse signal recovery and establish its connections with
other performance criteria. We define a family of goodness measures of the sensing
matrix, and use them to derive performance bounds on the $\ell_\infty$ norm of the recovery
error vector. Performance bounds using other norms are expressed using the $\ell_\infty$
norm. Numerical simulations show that these bounds are tighter than the RIC based
bounds when the sparsity levels of the signals are relatively small. Secondly and most
importantly, using fixed point theory, we develop algorithms to efficiently compute
the goodness measures for given sensing matrices by solving a series of second-order
cone programs or linear programs, depending on the specific goodness measure being
computed. We analytically demonstrate the algorithms' convergence to the global
optima from any initial point. As a by-product, we obtain a fast algorithm to verify the
sufficient condition guaranteeing exact sparse recovery via $\ell_1$ minimization. Finally,
we show that the goodness measures are non-degenerate for subgaussian and isotropic
random sensing matrices as long as the number of measurements is relatively large, a
result parallel to that of the RIC for random matrices.

Several attempts have been made to address the verifiability and computability
of performance analysis for sparse signal recovery, mainly based on the RIC [7], [9] and
the Null Space Property (NSP) [iSj. Due to the difficulty of explicitly computing the
RIC and verifying the NSP, researchers use relaxation techniques to approximate these
quantities. Examples include semi-definite programming relaxation [14], [15] and linear
programming relaxation [20]. To the best of the authors' knowledge, the algorithms
of [14] and [20] represent state-of-the-art techniques in verifying the sufficient condition
of unique $\ell_1$ recovery. In this paper, we directly address the computability of the
performance bounds. More explicitly, we define the goodness measures of the sensing
matrices as optimization problems and design efficient algorithms with theoretical
convergence guarantees to solve the optimization problems. An algorithm to verify
a sufficient condition for exact $\ell_1$ recovery is obtained only as a by-product. Our
implementation of the algorithm performs orders of magnitude faster than the
state-of-the-art techniques in [14] and [20], consumes much less memory, and produces
comparable results.

The paper is organized as follows. In Section 2, we introduce notations, and we
present the measurement model, three convex relaxation algorithms, and the sufficient
and necessary condition for exact $\ell_1$ recovery. In Section 3, we derive performance
bounds on the $\ell_\infty$ norms of the recovery errors for several convex relaxation
algorithms. In Section 4, we design algorithms to verify a sufficient condition for exact $\ell_1$
recovery in the noise-free case, and to compute the goodness measures of arbitrarily
given sensing matrices. Section 5 is devoted to the probabilistic analysis of our $\ell_\infty$
performance measures. We evaluate the algorithms' performance in Section 6. Section
7 summarizes our conclusions.

2. Notations, Measurement Model, and Recovery Algorithms. In this
section, we introduce notations and the measurement model, and review recovery
algorithms based on $\ell_1$ minimization.

For any vector $x \in \mathbb{R}^n$, the norm $\|x\|_{k,1}$ is the summation of the absolute values of
the $k$ (absolutely) largest components of $x$. In particular, the $\ell_\infty$ norm $\|x\|_\infty = \|x\|_{1,1}$
and the $\ell_1$ norm $\|x\|_1 = \|x\|_{n,1}$. The classical inner product in $\mathbb{R}^n$ is denoted by
$\langle \cdot, \cdot \rangle$, and the $\ell_2$ (or Euclidean) norm is $\|x\|_2 = \sqrt{\langle x, x \rangle}$. We use
$\|\cdot\|_\diamond$ to denote a general norm.

The support of $x$, $\mathrm{supp}(x)$, is the index set of the non-zero components of $x$. The
size of the support, usually denoted by the $\ell_0$ "norm" $\|x\|_0$, is the sparsity level of $x$.
Signals of sparsity level at most $k$ are called $k$-sparse signals. If $S \subseteq \{1, \ldots, n\}$ is an
index set, then $|S|$ is the cardinality of $S$, and $x_S \in \mathbb{R}^{|S|}$ is the vector formed by the
components of $x$ with indices in $S$.
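The notation above can be made concrete with a small sketch. The helper names `norm_k1` and `support` are illustrative only and do not come from the paper; the code simply evaluates $\|x\|_{k,1}$ and $\mathrm{supp}(x)$ with NumPy and checks the two special cases $k = 1$ and $k = n$.

```python
import numpy as np

def norm_k1(x, k):
    """Sum of the k largest absolute values of x, i.e. ||x||_{k,1}."""
    a = np.sort(np.abs(np.asarray(x, dtype=float)))[::-1]  # descending magnitudes
    return float(a[:k].sum())

def support(x, tol=0.0):
    """Index set (0-based) of the entries of x whose magnitude exceeds tol."""
    return set(np.flatnonzero(np.abs(np.asarray(x)) > tol).tolist())

x = np.array([0.0, -3.0, 0.5, 0.0, 2.0])
print(norm_k1(x, 1))        # coincides with the l_inf norm of x
print(norm_k1(x, len(x)))   # coincides with the l_1 norm of x
print(support(x))           # indices of the non-zero components
```

Note that the indices are 0-based here, whereas the text indexes components from 1.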

We use $e_i$, $0$, $O$, and $\mathbf{1}$ to denote respectively the $i$th canonical basis vector, the
zero column vector, the zero matrix, and the column vector with all ones.

Suppose $x$ is a $k$-sparse signal. In this paper, we observe $x$ through the following
linear model:

$y = Ax + w,$ (2.1)

where $A \in \mathbb{R}^{m \times n}$ is the measurement/sensing matrix, $y$ is the measurement vector,
and $w$ is noise.

Many algorithms have been proposed to recover $x$ from $y$ by exploiting the
sparseness of $x$. We focus on three algorithms based on $\ell_1$ minimization: the Basis
Pursuit [12], the Dantzig selector [10], and the LASSO estimator [29]:

Basis Pursuit: $\min_{z \in \mathbb{R}^n} \|z\|_1 \ \text{s.t.}\ \|y - Az\|_2 \le \varepsilon$ (2.2)
Dantzig: $\min_{z \in \mathbb{R}^n} \|z\|_1 \ \text{s.t.}\ \|A^T (y - Az)\|_\infty \le \mu$ (2.3)
LASSO: $\min_{z \in \mathbb{R}^n} \frac{1}{2}\|y - Az\|_2^2 + \mu \|z\|_1.$ (2.4)

Here $\mu$ is a tuning parameter, and $\varepsilon$ is a measure of the noise level. All three
optimization problems have efficient implementations using convex programming or even
linear programming.

In the noise-free case where $w = 0$, roughly speaking, all three algorithms
reduce to

$\min_{z \in \mathbb{R}^n} \|z\|_1 \ \text{s.t.}\ Az = Ax,$ (2.5)

which is the $\ell_1$ relaxation of the NP hard $\ell_0$ minimization problem:

$\min_{z \in \mathbb{R}^n} \|z\|_0 \ \text{s.t.}\ Az = Ax.$ (2.6)
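As an illustration of the noise-free $\ell_1$ problem (2.5), the following sketch casts it as a linear program by splitting $z = p - q$ with $p, q \ge 0$. This is a generic textbook reformulation solved with SciPy's `linprog`, not the primal-dual implementation used by the authors; the function name `l1_min_equality`, the problem sizes, and the random test signal are assumptions made for the example.

```python
import numpy as np
from scipy.optimize import linprog

def l1_min_equality(A, b):
    """Solve min ||z||_1 s.t. A z = b via the split z = p - q with p, q >= 0."""
    m, n = A.shape
    c = np.ones(2 * n)             # objective: sum(p) + sum(q) >= ||z||_1
    A_eq = np.hstack([A, -A])      # A (p - q) = b
    res = linprog(c, A_eq=A_eq, b_eq=b, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:n] - res.x[n:]

# Hypothetical test problem: a 3-sparse signal observed through a Gaussian matrix.
rng = np.random.default_rng(0)
m, n = 20, 40
A = rng.standard_normal((m, n))
x = np.zeros(n)
x[[3, 17, 29]] = [1.5, -2.0, 0.7]
z = l1_min_equality(A, A @ x)
print("max deviation from the true signal:", np.max(np.abs(z - x)))
```

Whether `z` coincides with `x` depends on the matrix satisfying the recovery condition discussed next; the sketch only guarantees that `z` is feasible and has $\ell_1$ norm no larger than that of `x`.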

A minimal requirement on $\ell_1$ minimization algorithms is the uniqueness and
exactness of the solution $\hat{x} := \operatorname{argmin}_{z : Az = Ax} \|z\|_1$, i.e., $\hat{x} = x$. When the true
signal $x$ is $k$-sparse, the sufficient and necessary condition for exact $\ell_1$ recovery is [16], [17], [30]

$\sum_{i \in S} |z_i| < \sum_{i \notin S} |z_i|, \quad \forall z \in \mathrm{Ker}(A),\ |S| \le k,$ (2.7)

where $\mathrm{Ker}(A) := \{z : Az = 0\}$ is the kernel of $A$, and $S \subseteq \{1, \ldots, n\}$ is an index set.
Expressed in terms of $\|\cdot\|_{k,1}$, the necessary and sufficient condition becomes

$\|z\|_{k,1} < \frac{1}{2}\|z\|_1, \quad \forall z \in \mathrm{Ker}(A).$ (2.8)

The approaches in [20] and [14] for verifying the sufficient condition (2.8) are
based on relaxing the following optimization problem in various ways:

$\alpha_k = \max_z \|z\|_{k,1} \ \text{s.t.}\ Az = 0,\ \|z\|_1 \le 1.$ (2.9)

Clearly, $\alpha_k < 1/2$ is necessary and sufficient for exact $\ell_1$ recovery for $k$-sparse signals.
Unfortunately, the direct computation of (2.9) for general $k$ is extremely difficult: it
is the maximization of a norm (convex function) over a polyhedron (convex set) [4].
In [20], in a very rough sense, $\alpha_1$ was computed by solving $n$ linear programs:

$\min_{y_i} \|e_i - A^T y_i\|_\infty, \quad i = 1, \ldots, n,$ (2.10)

where $e_i$ is the $i$th canonical basis vector in $\mathbb{R}^n$. This, together with the observation that
$\alpha_k \le k\alpha_1$, yields an efficient algorithm to verify (2.8). However, in [26], we found that
the primal-dual method of directly solving (2.8) as the following $n$ linear programs

$\max_z z_i \ \text{s.t.}\ Az = 0,\ \|z\|_1 \le 1$ (2.11)

gives rise to an algorithm orders of magnitude faster. In the next section, we will
see how the computation of $\alpha_1$ arises naturally in the context of $\ell_\infty$ performance
evaluation.

3. Performance Bounds on the $\ell_\infty$ Norms of the Recovery Errors. In
this section, we derive performance bounds on the $\ell_\infty$ norms of the error vectors. We
first establish a proposition characterizing the error vectors for the $\ell_1$ recovery algorithms,
whose proof is given in Appendix 8.1.

Proposition 3.1. Suppose $x$ in (2.1) is $k$-sparse and the noise $w$ satisfies
$\|w\|_2 \le \varepsilon$, $\|A^T w\|_\infty \le \mu$, and $\|A^T w\|_\infty \le \kappa\mu$, $\kappa \in (0, 1)$, for the Basis Pursuit, the
Dantzig selector, and the LASSO estimator, respectively. Define $h = \hat{x} - x$ as the
error vector for any of the three $\ell_1$ recovery algorithms (2.2), (2.3), and (2.4). Then
we have

$c\|h\|_{k,1} \ge \|h\|_1,$ (3.1)

where $c = 2$ for the Basis Pursuit and the Dantzig selector, and $c = 2/(1 - \kappa)$ for the
LASSO estimator.

An immediate corollary of Proposition 3.1 is to bound the $\ell_1$ and $\ell_2$ norms of the
error vector using the $\ell_\infty$ norm:

Corollary 3.2. Under the assumptions of Proposition 3.1, we have

$\|h\|_1 \le ck\|h\|_\infty,$ (3.2)

$\|h\|_2 \le \sqrt{ck}\,\|h\|_\infty.$ (3.3)

Furthermore, if $S = \mathrm{supp}(x)$ and $\beta = \min_{i \in S} |x_i|$, then $\|h\|_\infty < \beta/2$ implies

$\mathrm{supp}(\max(|\hat{x}| - \beta/2, 0)) = \mathrm{supp}(x),$ (3.4)

i.e., a thresholding operator recovers the signal support.
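The thresholding statement (3.4) can be checked on a toy example. The sketch below is illustrative only: the signal, the error vector `h`, and the helper name `threshold_support` are made up for the example, with `h` chosen so that $\|h\|_\infty < \beta/2$.

```python
import numpy as np

def threshold_support(x_hat, beta):
    """Support of max(|x_hat| - beta/2, 0), as in (3.4); indices are 0-based."""
    return set(np.flatnonzero(np.abs(x_hat) - beta / 2 > 0).tolist())

x = np.array([0.0, 2.0, 0.0, -1.0, 0.0])      # true sparse signal
beta = np.min(np.abs(x[x != 0]))              # smallest non-zero magnitude
h = np.array([0.4, -0.3, 0.45, 0.2, -0.1])    # error with ||h||_inf < beta / 2
x_hat = x + h                                 # hypothetical recovered signal

print(threshold_support(x_hat, beta))         # matches the support of x
```

Entries off the support stay below $\beta/2$ in magnitude and entries on the support stay above it, which is exactly why the $\ell_\infty$ bound controls support recovery.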

For ease of presentation, we have the following definition:

Definition 3.3. For any real number $s \in [1, n]$ and matrix $A \in \mathbb{R}^{m \times n}$, define

$\omega_\diamond(Q, s) = \min_{z : \|z\|_1/\|z\|_\infty \le s} \frac{\|Qz\|_\diamond}{\|z\|_\infty},$ (3.5)

where $Q$ is either $A$ or $A^T A$.

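Now we present the error bounds on the $\ell_\infty$ norm of the error vectors for the
Basis Pursuit, the Dantzig selector, and the LASSO estimator.

Theorem 3.4. Under the assumption of Proposition 3.1, we have

$\|\hat{x} - x\|_\infty \le \frac{2\varepsilon}{\omega_2(A, 2k)}$ (3.6)

for the Basis Pursuit,

$\|\hat{x} - x\|_\infty \le \frac{2\mu}{\omega_\infty(A^T A, 2k)}$ (3.7)

for the Dantzig selector, and

$\|\hat{x} - x\|_\infty \le \frac{(1 + \kappa)\mu}{\omega_\infty(A^T A, 2k/(1 - \kappa))}$ (3.8)

for the LASSO estimator.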

Proof. Observe that for the Basis Pursuit

$\|A(\hat{x} - x)\|_2 \le \|y - A\hat{x}\|_2 + \|y - Ax\|_2$
$\le \varepsilon + \|w\|_2$
$\le 2\varepsilon,$ (3.9)

and similarly,

$\|A^T A(\hat{x} - x)\|_\infty \le 2\mu$ (3.10)

for the Dantzig selector, and

$\|A^T A(\hat{x} - x)\|_\infty \le (1 + \kappa)\mu$ (3.11)

for the LASSO estimator. The conclusions of Theorem 3.4 follow from equations (3.2),
(3.3), and Definition 3.3. □

One of the primary contributions of this work is the design of algorithms that
compute $\omega_\diamond(A, s)$ and $\omega_\infty(A^T A, s)$ efficiently. The algorithms provide a way to
numerically assess the performance of the Basis Pursuit, the Dantzig selector, and the
LASSO estimator according to the bounds given in Theorem 3.4. According to
Corollary 3.2, the correct recovery of signal support is also guaranteed by reducing the $\ell_\infty$
norm to some threshold. In Section 5, we also demonstrate that the bounds in
Theorem 3.4 are non-trivial for a large class of random sensing matrices, as long as $m$ is
relatively large. Numerical simulations in Section 6 show that in many cases the error
bounds on the $\ell_2$ norms based on Corollary 3.2 and Theorem 3.4 are tighter than the
RIC based bounds. We expect the bounds on the $\ell_\infty$ norms in Theorem 3.4 to be even
tighter, as we do not need the relaxation in Corollary 3.2.

We note that a prerequisite for these bounds to be valid is the positiveness of the
involved $\omega_\diamond(\cdot)$. We call the validation of $\omega_\diamond(\cdot) > 0$ the verification problem. Note that
from Theorem 3.4, $\omega_\diamond(\cdot) > 0$ implies the exact recovery of the true signal $x$ in the
noise-free case. Therefore, verifying $\omega_\diamond(\cdot) > 0$ is equivalent to verifying a sufficient
condition for exact $\ell_1$ recovery.

4. Verification and Computation of $\omega_\diamond$. In this section, we present
algorithms for verification and computation of $\omega_\diamond(\cdot)$. We will present a very general
algorithm and make it specific only when necessary. For this purpose, we use $Q$ to
denote either $A$ or $A^T A$, and use $\|\cdot\|_\diamond$ to denote a general norm.

4.1. Verification of $\omega_\diamond > 0$. Verifying $\omega_\diamond(Q, s) > 0$ amounts to making sure
that $\|z\|_1/\|z\|_\infty > s$ for all non-zero $z$ such that $Qz = 0$. Equivalently, we can compute

$s^* = \min_z \frac{\|z\|_1}{\|z\|_\infty} \ \text{s.t.}\ Qz = 0.$ (4.1)



PERFORMANCE ANALYSIS OF SPARSITY RECOVERY 






Then, when $s < s^*$, we have $\omega_\diamond(Q, s) > 0$. We rewrite the optimization (4.1) as

$\frac{1}{s^*} = \max_z \|z\|_\infty \ \text{s.t.}\ Qz = 0,\ \|z\|_1 \le 1,$ (4.2)

which is solved using the following $n$ linear programs:

$\max_z z_i \ \text{s.t.}\ Qz = 0,\ \|z\|_1 \le 1.$ (4.3)

The dual problem for (4.3) is

$\min_\lambda \|e_i - Q^T \lambda\|_\infty,$ (4.4)

where $e_i$ is the $i$th canonical basis vector.

We solve (4.3) using the primal-dual algorithm expounded in Chapter 11 of [5],
which gives an implementation much more efficient than the one for solving its dual
(4.4) in [20]. This method is also used to implement the $\ell_1$-MAGIC package for sparse signal
recovery [6]. Due to the equivalence of $A^T A z = 0$ and $Az = 0$, we always solve
(4.2) for $Q = A$ and avoid $Q = A^T A$. The former clearly involves solving linear
programs of smaller size. In practice, we usually replace $A$ with the matrix with
orthogonal rows obtained from the economy-size QR decomposition of $A^T$.
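The verification step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' primal-dual code: it solves the $n$ linear programs (4.3) with SciPy's general-purpose `linprog` (HiGHS backend) after the split $z = p - q$, $p, q \ge 0$, and it assumes $A$ has full row rank for the QR step. The function names `orthonormalize_rows` and `s_star` are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog

def orthonormalize_rows(A):
    """Matrix with orthonormal rows and the same kernel as A (A of full row rank),
    obtained from the economy-size QR factorization of A^T."""
    q_factor, _ = np.linalg.qr(A.T, mode="reduced")
    return q_factor.T

def s_star(Q, tol=1e-12):
    """s* = 1 / max_i max{ z_i : Q z = 0, ||z||_1 <= 1 }, cf. (4.1)-(4.3)."""
    m, n = Q.shape
    A_eq = np.hstack([Q, -Q])          # Q (p - q) = 0
    b_eq = np.zeros(m)
    A_ub = np.ones((1, 2 * n))         # sum(p) + sum(q) <= 1
    b_ub = np.array([1.0])
    best = 0.0
    for i in range(n):
        c = np.zeros(2 * n)
        c[i], c[n + i] = -1.0, 1.0     # minimize -(p_i - q_i), i.e. maximize z_i
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                      bounds=(0, None), method="highs")
        if not res.success:
            raise RuntimeError(res.message)
        best = max(best, -res.fun)
    return np.inf if best <= tol else 1.0 / best

A = np.array([[1.0, 2.0]])             # kernel spanned by (2, -1)
print(s_star(orthonormalize_rows(A)))  # ||z||_1 / ||z||_inf = 3/2 on the kernel
```

For this toy matrix every kernel vector is a multiple of $(2, -1)$, so the ratio $\|z\|_1/\|z\|_\infty$ is constant and the computed value can be checked by hand; any $s$ below it certifies $\omega_\diamond(Q, s) > 0$.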



As a dual of (4.4), (4.3) (and hence (4.2) and (4.1)) shares the same limitation
as (4.4), namely, it verifies $\omega_\diamond > 0$ only for $s$ up to $2\sqrt{2m}$. We now reformulate
Proposition 4 of [20] in our framework:

Proposition 4.1. [20, Proposition 4] For any $m \times n$ matrix $A$ with $n \ge 32m$,
one has

$\min \left\{ \frac{\|z\|_1}{\|z\|_\infty} : Az = 0 \right\} < 2\sqrt{2m}.$ (4.5)



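4.2. Computation of $\omega_\diamond$. Now we turn to one of the primary contributions of
this work, the computation of $\omega_\diamond$. The optimization problem is as follows:

$\omega_\diamond(Q, s) = \min_z \frac{\|Qz\|_\diamond}{\|z\|_\infty} \ \text{s.t.}\ \frac{\|z\|_1}{\|z\|_\infty} \le s,$ (4.6)

or equivalently,

$\frac{1}{\omega_\diamond(Q, s)} = \max_z \|z\|_\infty \ \text{s.t.}\ \|Qz\|_\diamond \le 1,\ \frac{\|z\|_1}{\|z\|_\infty} \le s.$ (4.7)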



We will show that $1/\omega_\diamond(Q, s)$ is the unique fixed point of a certain scalar function.
To this end, we define functions $f_{s,i}(\eta)$, $i = 1, \ldots, n$, and $f_s(\eta)$ over $[0, \infty)$,
parameterized by $s \in (1, s^*)$:

$f_{s,i}(\eta) := \max_z \{ z_i : \|Qz\|_\diamond \le 1,\ \|z\|_1 \le s\eta \}$
$= \max_z \{ |z_i| : \|Qz\|_\diamond \le 1,\ \|z\|_1 \le s\eta \},$ (4.8)






GONGGUO TANG AND ARYE NEHORAI 



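since the domain for the maximization is symmetric about the origin, and

$f_s(\eta) := \max_z \{ \|z\|_\infty : \|Qz\|_\diamond \le 1,\ \|z\|_1 \le s\eta \}$
$= \max_{z : \|Qz\|_\diamond \le 1,\ \|z\|_1 \le s\eta} \ \max_i |z_i|$
$= \max_i \ \max_{z : \|Qz\|_\diamond \le 1,\ \|z\|_1 \le s\eta} |z_i|$
$= \max_i f_{s,i}(\eta),$ (4.9)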

where for the last but one equality we have exchanged the two maximizations. For
$\eta > 0$, it is easy to show that strong duality holds for the optimization problem
defining $f_{s,i}(\eta)$. As a consequence, we have the dual form of $f_{s,i}(\eta)$:

$f_{s,i}(\eta) = \min_\lambda \ s\eta \|e_i - Q^T \lambda\|_\infty + \|\lambda\|_\diamond^*,$ (4.10)

where $\|\cdot\|_\diamond^*$ is the dual norm of $\|\cdot\|_\diamond$.

In the definition of $f_s(\eta)$, we basically replaced the $\|z\|_\infty$ in the denominator of
the fractional constraint in (4.7) with $\eta$. The following theorem states that the unique
positive fixed point of $f_s(\eta)$ is exactly $1/\omega_\diamond(Q, s)$. See Appendix 8.2 for the proof.
Theorem 4.2. The functions $f_{s,i}(\eta)$ and $f_s(\eta)$ have the following properties:

1. $f_{s,i}(\eta)$ and $f_s(\eta)$ are continuous in $\eta$;

2. $f_{s,i}(\eta)$ and $f_s(\eta)$ are strictly increasing in $\eta$;

3. $f_{s,i}(\eta)$ is concave for every $i$;

4. $f_s(0) = 0$, $f_s(\eta) \ge s\eta > \eta$ for sufficiently small $\eta > 0$, and there exists $\rho < 1$
such that $f_s(\eta) < \rho\eta$ for sufficiently large $\eta$; the same holds for $f_{s,i}(\eta)$;

5. $f_{s,i}(\eta)$ and $f_s(\eta)$ have unique positive fixed points $\eta_i^* = f_{s,i}(\eta_i^*)$ and $\eta^* = f_s(\eta^*)$,
respectively; and $\eta^* = \max_i \eta_i^*$;

6. The unique positive fixed point of $f_s(\eta)$, $\eta^*$, is equal to $1/\omega_\diamond(Q, s)$;

7. For $\eta \in (0, \eta^*)$, we have $f_s(\eta) > \eta$, and for $\eta \in (\eta^*, \infty)$, we have $f_s(\eta) < \eta$;
the same statement holds also for $f_{s,i}(\eta)$;

8. For any $\epsilon > 0$, there exists $\rho_1(\epsilon) > 1$ such that $f_s(\eta) > \rho_1(\epsilon)\eta$ as long as
$0 < \eta \le (1 - \epsilon)\eta^*$; and there exists $\rho_2(\epsilon) < 1$ such that $f_s(\eta) < \rho_2(\epsilon)\eta$ as long
as $\eta \ge (1 + \epsilon)\eta^*$.



Theorem 4.2 implies three ways to compute the fixed point $\eta^* = 1/\omega_\diamond(Q, s)$
of $f_s(\eta)$.

1. Naive Fixed Point Iteration: Property 8) of Theorem 4.2 suggests that
the fixed point iteration

$\eta_{t+1} = f_s(\eta_t), \quad t = 0, 1, \ldots$ (4.11)

starting from any initial point $\eta_0 > 0$ converges to $\eta^*$, no matter whether $\eta_0 < \eta^*$ or
$\eta_0 > \eta^*$. The algorithm can be made more efficient in the case $\eta_0 < \eta^*$. More
specifically, since $f_s(\eta) = \max_i f_{s,i}(\eta)$, at each fixed point iteration, we set
$\eta_{t+1}$ to be the first $f_{s,i}(\eta_t)$ that is greater than $\eta_t + \epsilon$, with $\epsilon$ some tolerance
parameter. If for all $i$, $f_{s,i}(\eta_t) \le \eta_t + \epsilon$, then $f_s(\eta_t) = \max_i f_{s,i}(\eta_t) \le \eta_t + \epsilon$,
which indicates that the optimal function value cannot be improved greatly and
the algorithm should terminate. In most cases, to get $\eta_{t+1}$, we need to solve
only one optimization problem $\max_z \{ z_i : \|Qz\|_\diamond \le 1,\ \|z\|_1 \le s\eta_t \}$ instead of
$n$. This is in contrast to the case where $\eta_0 > \eta^*$, because in the latter case we
must compute all $f_{s,i}(\eta_t)$ to update $\eta_{t+1} = \max_i f_{s,i}(\eta_t)$. An update based
on a single $f_{s,i}(\eta_t)$ might generate a value smaller than $\eta^*$.

Fig. 4.1: Illustration of the naive fixed point iteration (4.11) when $\diamond = \infty$.
In Figure 4.1, we illustrate the behavior of the naive fixed point iteration
algorithm (4.11). These figures are generated by Matlab for a two-dimensional
problem. We index the sub-figures from left to right and from top
to bottom. The first (upper left) sub-figure shows the star-shaped region
$S = \{z : \|Qz\|_\infty \le 1,\ \|z\|_1/\|z\|_\infty \le s\}$. Starting from an initial $\eta_0 < \eta^*$, the
algorithm solves

$\max_z \|z\|_\infty \ \text{s.t.}\ \|Qz\|_\infty \le 1,\ \|z\|_1 \le s\eta_0$ (4.12)

in sub-figure 2. The solution is denoted by the black dot. Although the true
domain for the optimization in (4.12) is the intersection of the distorted $\ell_\infty$
ball $\{z : \|Qz\|_\infty \le 1\}$ and the $\ell_1$ ball $\{z : \|z\|_1 \le s\eta_0\}$, the intersection of the
$\ell_1$ ball (light gray diamond) and the star-shaped region $S$ forms the effective
domain, which is the dark grey region in the sub-figures. To see this, we
note that the optimal value of the optimization (4.12) is $\eta_1 = \|x_1^*\|_\infty = f_s(\eta_0) > \eta_0$
according to 7) of Theorem 4.2, implying that, for the optimal solution $x_1^*$,
we have $\|x_1^*\|_1/\|x_1^*\|_\infty \le \|x_1^*\|_1/\eta_0 \le s$. Therefore, the optimal solution $x_1^*$
can always be found in the dark grey region. In the following sub-figures,
at each iteration, we expand the $\ell_1$ ball until we get to the tip point of the
star-shaped region $S$, which is the global optimum.

Despite its simplicity, the naive fixed point iteration has two major
disadvantages. Firstly, the stopping criterion based on successive improvement
is not accurate, as it does not reflect the gap between $\eta_t$ and $\eta^*$. This
disadvantage can be remedied by starting from both below and above $\eta^*$. The
distance between corresponding terms in the two generated sequences is an
indication of the gap to the fixed point $\eta^*$. However, the resulting algorithm
is generally slow, especially when updating $\eta_{t+1}$ from above $\eta^*$. Secondly,
the iteration process is slow when close to the fixed point $\eta^*$. This is because
$\rho_1(\epsilon)$ and $\rho_2(\epsilon)$ in 8) of Theorem 4.2 are close to 1 for small $\epsilon > 0$.
2. Bisection: The bisection approach is motivated by property 7) of Theorem
4.2. Starting from an initial interval $(\eta_L, \eta_U)$ that contains $\eta^*$, we compute
$f_s(\eta_M)$ with $\eta_M = (\eta_L + \eta_U)/2$. As a consequence of property 7), $f_s(\eta_M) > \eta_M$
implies $f_s(\eta_M) < \eta^*$, and we set $\eta_L = f_s(\eta_M)$; $f_s(\eta_M) < \eta_M$ implies $f_s(\eta_M) > \eta^*$,
and we set $\eta_U = f_s(\eta_M)$. The bisection process can also be accelerated by
setting $\eta_L = f_{s,i}(\eta_M)$ for the first $f_{s,i}(\eta_M)$ greater than $\eta_M$. The convergence
of the bisection approach is much faster than the naive fixed point iteration
because each iteration reduces the interval length at least by half. In addition,
half the length of the interval is an upper bound on the gap between $\eta_M$ and
$\eta^*$, resulting in an accurate stopping criterion. However, if the initial $\eta_U$ is much
larger than $\eta^*$, the majority of $f_s(\eta_M)$ would turn out to be less than $\eta_M$. The
verification of $f_s(\eta_M) < \eta_M$ needs solving $n$ linear programs or second-order
cone programs, greatly degrading the algorithm's performance.
3. Fixed Point Iteration + Bisection: The third approach combines the
advantages of the bisection method and the fixed point iteration method,
at the level of $f_{s,i}(\eta)$. This method relies on the representation $f_s(\eta) =
\max_i f_{s,i}(\eta)$ and $\eta^* = \max_i \eta_i^*$.

Starting from an initial interval $(\eta_{L0}, \eta_U)$ and the index set $\mathcal{I}_0 = \{1, \ldots, n\}$,
we pick any $i_0 \in \mathcal{I}_0$ and use the (accelerated) bisection method with starting
interval $(\eta_{L0}, \eta_U)$ to find the positive fixed point $\eta_{i_0}^*$ of $f_{s,i_0}(\eta)$. For any
$i \in \mathcal{I}_0 \setminus \{i_0\}$, $f_{s,i}(\eta_{i_0}^*) \le \eta_{i_0}^*$ implies that the fixed point $\eta_i^*$ of $f_{s,i}(\eta)$ is less
than or equal to $\eta_{i_0}^*$, according to the continuity of $f_{s,i}(\eta)$ and the uniqueness
of its positive fixed point. As a consequence, we remove this $i$ from the
index set $\mathcal{I}_0$. We denote by $\mathcal{I}_1$ the index set after all such $i$'s are removed, i.e.,
$\mathcal{I}_1 = \mathcal{I}_0 \setminus \{i : f_{s,i}(\eta_{i_0}^*) \le \eta_{i_0}^*\}$. We then set $\eta_{L1} = \eta_{i_0}^*$, as $\eta^* \ge \eta_{i_0}^*$. Next we test
the $i_1 \in \mathcal{I}_1$ with the largest $f_{s,i}(\eta_{i_0}^*)$ and construct $\mathcal{I}_2$ and $\eta_{L2}$ in a similar
manner. We repeat the process until the index set $\mathcal{I}_t$ is empty. The $\eta_i^*$ found
at the last step is the maximal $\eta_i^*$, which is equal to $\eta^*$.
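The first two approaches can be sketched for the case $\diamond = \infty$, where each $f_{s,i}(\eta)$ is a linear program. The code below is an illustrative sketch rather than the authors' implementation: it uses SciPy's `linprog` instead of a custom primal-dual solver, omits the acceleration tricks described above, and the function names, tolerances, and starting values are assumptions.

```python
import numpy as np
from scipy.optimize import linprog

def f_si(Q, s, eta, i):
    """f_{s,i}(eta) = max{ z_i : ||Q z||_inf <= 1, ||z||_1 <= s * eta }, z = p - q."""
    r, n = Q.shape
    c = np.zeros(2 * n)
    c[i], c[n + i] = -1.0, 1.0                    # minimize -(p_i - q_i)
    A_ub = np.vstack([np.hstack([Q, -Q]),         #  Q z <= 1
                      np.hstack([-Q, Q]),         # -Q z <= 1
                      np.ones((1, 2 * n))])       # sum(p) + sum(q) <= s * eta
    b_ub = np.concatenate([np.ones(2 * r), [s * eta]])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return -res.fun

def f_s(Q, s, eta):
    """f_s(eta) = max_i f_{s,i}(eta), cf. (4.9)."""
    return max(f_si(Q, s, eta, i) for i in range(Q.shape[1]))

def omega_inf_fixed_point(Q, s, eta0=1e-2, tol=1e-8, max_iter=500):
    """Naive iteration eta_{t+1} = f_s(eta_t) of (4.11); returns 1 / eta."""
    eta = eta0
    for _ in range(max_iter):
        nxt = f_s(Q, s, eta)
        done = abs(nxt - eta) <= tol * max(1.0, eta)
        eta = nxt
        if done:
            break
    return 1.0 / eta

def omega_inf_bisection(Q, s, eta_lo, eta_hi, tol=1e-8):
    """Bisection on (eta_lo, eta_hi), assumed to contain the fixed point eta*."""
    while eta_hi - eta_lo > tol:
        mid = 0.5 * (eta_lo + eta_hi)
        val = f_s(Q, s, mid)
        if val > mid:
            eta_lo = val        # mid lies below eta*, and val does not exceed eta*
        elif val < mid:
            eta_hi = val        # mid lies above eta*, and val is at least eta*
        else:
            return 1.0 / mid    # mid is itself the fixed point
    return 2.0 / (eta_lo + eta_hi)

print(omega_inf_fixed_point(np.eye(2), 1.5))
print(omega_inf_bisection(np.eye(2), 1.5, 0.1, 5.0))
```

For $Q = I$ the ratio $\|Qz\|_\infty/\|z\|_\infty$ equals 1 for every $z$, so both routines should return a value close to 1; this gives a quick sanity check before trying a real sensing matrix.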



Note that in equations (4.6), (4.7), and (4.9), if we replace the $\ell_\infty$ norm with any
other norm (with some other minor modifications), especially $\|\cdot\|_{s,1}$ or $\|\cdot\|_2$, then a
naive fixed point iteration algorithm still exists. In addition, as we did in Corollary
3.2, we can express other norms on the error vector in terms of $\|\cdot\|_{s,1}$ and $\|\cdot\|_2$. We
expect the norm $\|\cdot\|_{s,1}$ would yield the tightest performance bounds. Unfortunately,
the major problem is that in these cases, the function $f_s(\eta)$ does not admit an obvious
polynomial time algorithm to compute. It is very likely that the corresponding norm
maximization defining $f_s(\eta)$ for $\|\cdot\|_{s,1}$ and $\|\cdot\|_2$ is NP hard.



5. Probabilistic Behavior of $\omega_\diamond(Q, s)$. In [26], we defined the $\ell_1$-constrained
minimal singular value ($\ell_1$-CMSV) as a goodness measure of the sensing matrix and
established performance bounds using the $\ell_1$-CMSV. For comparison, we include the
definition below:

Definition 1. For any $s \in [1, n]$ and matrix $A \in \mathbb{R}^{m \times n}$, define the $\ell_1$-constrained
minimal singular value (abbreviated as $\ell_1$-CMSV) of $A$ by

$\rho_s(A) = \min_{z : \|z\|_1^2/\|z\|_2^2 \le s} \frac{\|Az\|_2}{\|z\|_2}.$ (5.1)

Despite the seeming resemblance of the definitions between $\omega_\diamond(Q, s)$, especially
$\omega_2(A, s)$, and $\rho_s(A)$, the difference in the $\ell_\infty$ norm and the $\ell_2$ norm has important
implications. As shown in Theorem 4.2, the $\ell_\infty$ norm enables the design of optimization
procedures with nice convergence properties to efficiently compute $\omega_\diamond(Q, s)$. On the
other hand, the $\ell_1$-CMSV yields tight performance bounds at least for a large class
of random sensing matrices, as we will see in Theorem 5.2.

However, there are some interesting connections among these quantities, as shown
in the following proposition. These connections allow us to analyze the probabilistic
behavior of $\omega_\diamond(Q, s)$ using the results for $\rho_s(A)$ established in [26].

Proposition 5.1.

$\sqrt{s}\,\sqrt{\omega_\infty(A^T A, s)} \ge \omega_2(A, s) \ge \rho_{s^2}(A).$ (5.2)

Proof. For any $z$ such that $\|z\|_\infty = 1$ and $\|z\|_1 \le s$, we have

$z^T A^T A z \le \sum_i |z_i|\,|(A^T A z)_i|$
$\le \|z\|_1 \|A^T A z\|_\infty$
$\le s\|A^T A z\|_\infty.$ (5.3)

Taking the minimum over $\{z : \|z\|_\infty = 1,\ \|z\|_1 \le s\}$ yields

$\omega_2^2(A, s) \le s\,\omega_\infty(A^T A, s).$ (5.4)

Note that $\|z\|_1/\|z\|_\infty \le s$ implies $\|z\|_1 \le s\|z\|_\infty \le s\|z\|_2$, or equivalently,

$\{z : \|z\|_1/\|z\|_\infty \le s\} \subseteq \{z : \|z\|_1/\|z\|_2 \le s\}.$ (5.5)

As a consequence, we have

$\omega_2(A, s) = \min_{\|z\|_1/\|z\|_\infty \le s} \frac{\|Az\|_2}{\|z\|_2} \cdot \frac{\|z\|_2}{\|z\|_\infty}$
$\ge \min_{\|z\|_1/\|z\|_\infty \le s} \frac{\|Az\|_2}{\|z\|_2}$
$\ge \min_{\|z\|_1/\|z\|_2 \le s} \frac{\|Az\|_2}{\|z\|_2}$
$= \rho_{s^2}(A),$ (5.6)

where the first inequality is due to $\|z\|_2 \ge \|z\|_\infty$, and the second inequality is because
the minimization is taken over a larger set. □

As a consequence of the theorem we established in [26] and include below, we
derive a condition on the number of measurements to get $\omega_\diamond(Q, s)$ bounded away from
zero with high probability for sensing matrices with i.i.d. subgaussian and isotropic
rows. Note that a random vector $X \in \mathbb{R}^n$ is called isotropic and subgaussian with
constant $L$ if $E|\langle X, u \rangle|^2 = \|u\|_2^2$ and $P(|\langle X, u \rangle| \ge t) \le 2\exp(-t^2/(L\|u\|_2))$ hold for
any $u \in \mathbb{R}^n$.

Theorem 5.2. [26] Let the rows of the scaled sensing matrix $\sqrt{m}A$ be i.i.d.
subgaussian and isotropic random vectors with numerical constant $L$. Then there exist
constants $c_1$ and $c_2$ such that for any $\epsilon > 0$ and $m \ge 1$ satisfying

$m \ge c_1 \frac{L^2 s \log n}{\epsilon^2},$ (5.7)

we have

$E|1 - \rho_s(A)| \le \epsilon,$ (5.8)

and

$P\{1 - \epsilon \le \rho_s(A) \le 1 + \epsilon\} \ge 1 - \exp(-c_2 \epsilon^2 m / L^4).$ (5.9)


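Theorem 5.3. Under the assumptions and notations of Theorem 5.2, there exist
constants $c_1$ and $c_2$ such that for any $\epsilon > 0$ and $m \ge 1$ satisfying

$m \ge c_1 \frac{L^2 s^2 \log n}{\epsilon^2},$ (5.10)

we have

$E\,\omega_2(A, s) \ge 1 - \epsilon,$ (5.11)

$P\{\omega_2(A, s) \ge 1 - \epsilon\} \ge 1 - \exp(-c_2 \epsilon^2 m),$ (5.12)

and

$E\,\omega_\infty(A^T A, s) \ge \frac{(1 - \epsilon)^2}{s},$ (5.13)

$P\left\{\omega_\infty(A^T A, s) \ge \frac{(1 - \epsilon)^2}{s}\right\} \ge 1 - \exp(-c_2 \epsilon^2 m).$ (5.14)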



Sensing matrices with i.i.d. subgaussian and isotropic rows include the Gaussian
ensemble and the Bernoulli ensemble, as well as the normalized volume measure on
various convex symmetric bodies, for example, the unit balls of $\ell_p^n$ for $2 \le p \le \infty$ [23].

In equations (5.13) and (5.14), the extra $s$ in the lower bound of $\omega_\infty(A^T A, s)$ would
contribute an $s$ factor in the bounds of Theorem 3.4. It plays the same role as the
extra $\sqrt{k}$ factor in the error bounds for the Dantzig selector and the LASSO estimator
in terms of the RIC and the $\ell_1$-CMSV nE|26 .

The measurement bound (5.10) implies that the algorithms for verifying $\omega_\diamond > 0$
and for computing $\omega_\diamond$ work for $s$ at least up to the order $\sqrt{m/\log n}$. The order
$\sqrt{m/\log n}$ is complementary to the $\sqrt{m}$ upper bound in Proposition 4.1.
Note that Theorem 5.2 implies that the following program:

$\max_z \|z\|_2 \ \text{s.t.}\ Az = 0,\ \|z\|_1 \le 1,$ (5.15)

verifies the sufficient condition for exact $\ell_1$ recovery for $s$ up to the order $m/\log n$,
at least for subgaussian and isotropic random sensing matrices. Unfortunately, this
program is NP hard and hence not tractable.
6. Numerical Experiments. In this section, we provide implementation
details and numerically assess the performance of the algorithms for solving (4.6) using
the naive fixed point iteration. The numerical implementation and performance of
(4.1) were previously reported in [26] and hence are omitted here. All the numerical
experiments in this section were conducted on a desktop computer with a Pentium D
CPU@3.40GHz, 2GB RAM, and the Windows XP operating system, and the
computations ran on a single core.

Recall that, by the symmetry of its feasible set about the origin, the optimization
defining $f_{s,i}(\eta)$ can be written as the minimization

$\min_z z_i \ \text{s.t.}\ \|Qz\|_\diamond \le 1,\ \|z\|_1 \le s\eta,$ (6.1)

whose optimal value is $-f_{s,i}(\eta)$.



Depending on whether o = l,cx), or 2, (6.1) is solved using either linear programs 
or second-order cone programs. For example, when o = c», we have the following 
corresponding linear programs: 



lin [ 



0^ 



z 
u 



s.t. 



Q 


o 






1 


-Q 


o 








1 


I 


I 




z 


< 





-I 


-I 




u 













0^ 


1^ 






srj 



•2) 



These linear programs are implemented using the primal-dual algorithm outlined in 
Chapter 11 of [sj. The algorithm finds the optimal solution together with optimal 
dual vectors by solving the Karush-Kuhn- Tucker condition using linearization. The 
major computation is spent in solving linear systems of equations with positive definite 
coefficient matrices. When o = 2, we rewrite (6.1) as the following second-order cone 
programs 



min f ej 0"^ 



z 
u 



s.t. 



[Q o] 



z 
u 



I -I 
-I -I 

0^ 1^ 



z 
u 



< 



- 1 < 






srj 



(6.3) 



We use the log-barrier algorithm described in Chapter 11 of [5] to solve (6.3). Interested readers are encouraged to refer to [6] for a concise exposition of the general primal-dual and log-barrier algorithms and implementation details for similar linear programs and second-order cone programs.
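As a rough illustration of the $\diamond = \infty$ case, the sketch below evaluates $f_{s,i}(\eta)$ by passing the stacked variable $[z; u]$ and the inequality system of (6.2) to a general-purpose LP solver. It uses SciPy's `linprog` instead of the primal-dual implementation described above, and the function name `f_si_inf` is a hypothetical helper, not something from the paper.

```python
import numpy as np
from scipy.optimize import linprog


def f_si_inf(Q, i, s, eta):
    """Sketch: max z_i s.t. ||Qz||_inf <= 1, ||z||_1 <= s*eta, posed as an LP.

    Hypothetical helper; the variables are stacked as [z; u], where u bounds
    |z| entrywise so that the l1 constraint becomes linear.
    """
    Q = np.asarray(Q, dtype=float)
    p, n = Q.shape
    I = np.eye(n)
    O = np.zeros((p, n))
    # Minimizing -z_i is the same as maximizing z_i.
    c = np.concatenate([-I[i], np.zeros(n)])
    A_ub = np.vstack([
        np.hstack([Q, O]),     #  Qz <= 1
        np.hstack([-Q, O]),    # -Qz <= 1
        np.hstack([I, -I]),    #  z - u <= 0
        np.hstack([-I, -I]),   # -z - u <= 0
        np.concatenate([np.zeros(n), np.ones(n)])[None, :],  # 1^T u <= s*eta
    ])
    b_ub = np.concatenate([np.ones(2 * p), np.zeros(2 * n), [s * eta]])
    # linprog defaults to nonnegative variables, so free bounds are requested.
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(None, None))
    if not res.success:
        raise RuntimeError(res.message)
    return -res.fun
```

For $Q = I$ the constraints reduce to $\|z\|_\infty \le 1$ and $\|z\|_1 \le s\eta$, so the value should be $\min(1, s\eta)$, which gives a quick sanity check of the setup.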

We test the algorithms on Bernoulli, Gaussian, and Hadamard matrices of different sizes. The entries of Bernoulli and Gaussian matrices are randomly generated from the classical Bernoulli distribution with equal probability and the standard Gaussian distribution, respectively. For Hadamard matrices, first a square Hadamard matrix of size $n$ ($n$ is a power of 2) is generated, then its rows are randomly permuted and its first $m$ rows are taken as an $m \times n$ sensing matrix. All $m \times n$ matrices are normalized to have columns of unit length.
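The three test ensembles can be generated along the following lines. This is a minimal sketch under two assumptions: the Bernoulli entries are taken to be $\pm 1$ with equal probability, and the Hadamard matrix is built with the Sylvester construction. All function names are hypothetical.

```python
import numpy as np


def normalize_columns(A):
    """Scale every column of A to unit Euclidean length."""
    return A / np.linalg.norm(A, axis=0, keepdims=True)


def bernoulli_matrix(m, n, rng):
    # Assumption: "Bernoulli" means entries +1 or -1 with equal probability.
    return normalize_columns(rng.choice([-1.0, 1.0], size=(m, n)))


def gaussian_matrix(m, n, rng):
    return normalize_columns(rng.standard_normal((m, n)))


def hadamard_matrix(m, n, rng):
    """First m rows of a randomly row-permuted n x n Hadamard matrix."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of 2")
    H = np.array([[1.0]])
    while H.shape[0] < n:
        # Sylvester construction: doubles the size at each step.
        H = np.block([[H, H], [H, -H]])
    rows = rng.permutation(n)[:m]
    return normalize_columns(H[rows])
```

A call such as `hadamard_matrix(128, 256, np.random.default_rng(0))` would then produce a matrix of the shape used in the tables; the seed and sizes are illustrative only.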
We compare our recovery error bounds based on $\omega_\diamond$ with those based on the RIC. Combining Corollary 3.2 and Theorem 3.4, we have for the Basis Pursuit

$$\|\hat{x} - x\|_2 \le \frac{2\sqrt{2k}}{\omega_2(A,2k)}\,\epsilon, \tag{6.4}$$

and for the Dantzig selector

$$\|\hat{x} - x\|_2 \le \frac{2\sqrt{2k}}{\omega_\infty(A^TA,2k)}\,\mu. \tag{6.5}$$

For comparison, the two RIC bounds are

$$\|\hat{x} - x\|_2 \le \frac{4\sqrt{1+\delta_{2k}(A)}}{1-(1+\sqrt{2})\,\delta_{2k}(A)}\,\epsilon, \tag{6.6}$$

for the Basis Pursuit, assuming $\delta_{2k}(A) < \sqrt{2}-1$ [7], and the bound (6.7) of [10] for the Dantzig selector, which assumes $\delta_{2k}(A)+\delta_{3k}(A) < 1$. Without loss of generality, we set $\epsilon = 1$ and $\mu = 1$.
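For the Basis Pursuit, the two bounds being compared are $2\sqrt{2k}\,\epsilon/\omega_2(A,2k)$ from (6.4) and $4\sqrt{1+\delta_{2k}}\,\epsilon/(1-(1+\sqrt{2})\delta_{2k})$ from (6.6). The short sketch below evaluates both and returns `None` where a bound is not valid, which is how we read the blank table entries. The function names are hypothetical.

```python
import math


def omega_bound_bp(omega2, k, eps=1.0):
    """Bound (6.4): 2*sqrt(2k)/omega_2(A, 2k) * eps; None if omega_2 is not positive."""
    if omega2 <= 0:
        return None
    return 2.0 * math.sqrt(2 * k) / omega2 * eps


def ric_bound_bp(delta_2k, eps=1.0):
    """Bound (6.6); only valid when delta_2k < sqrt(2) - 1."""
    if delta_2k >= math.sqrt(2) - 1:
        return None
    return 4.0 * math.sqrt(1 + delta_2k) / (1 - (1 + math.sqrt(2)) * delta_2k) * eps
```

Note how quickly the RIC bound degrades: its denominator vanishes as $\delta_{2k}$ approaches $\sqrt{2}-1 \approx 0.414$, whereas the $\omega$ bound stays finite for any positive $\omega_2(A,2k)$.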

The RIC is computed using Monte Carlo simulations. More explicitly, for $\delta_{2k}(A)$, we randomly take 1000 sub-matrices of $A \in \mathbb{R}^{m \times n}$ of size $m \times 2k$, compute the maximal and minimal singular values $\sigma_1$ and $\sigma_{2k}$, and approximate $\delta_{2k}(A)$ using the maximum of $\max(\sigma_1^2 - 1,\ 1 - \sigma_{2k}^2)$ among all sampled sub-matrices. Obviously, the approximated RIC is always smaller than or equal to the exact RIC. As a consequence, the performance bounds based on the exact RIC are worse than those based on the approximated RIC. Therefore, in cases where our $\omega_\diamond$ based bounds are better (tighter, smaller) than the approximated RIC bounds, they are even better than the exact RIC bounds.
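The Monte Carlo procedure just described can be sketched as follows. The function name `approx_ric` and its arguments are hypothetical; `order` plays the role of $2k$, and the default of 1000 trials matches the number of sub-matrices quoted above.

```python
import numpy as np


def approx_ric(A, order, trials=1000, rng=None):
    """Monte Carlo lower estimate of the restricted isometry constant delta_order(A).

    Samples `trials` random column subsets of size `order` and keeps the largest
    value of max(sigma_max^2 - 1, 1 - sigma_min^2) seen over all subsets.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = A.shape[1]
    delta = 0.0
    for _ in range(trials):
        cols = rng.choice(n, size=order, replace=False)
        # Singular values are returned in descending order.
        sv = np.linalg.svd(A[:, cols], compute_uv=False)
        delta = max(delta, sv[0] ** 2 - 1.0, 1.0 - sv[-1] ** 2)
    return delta
```

Because only a sample of the column subsets is examined, the returned value can only underestimate the exact RIC, which is the property the comparison relies on.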



In Tables 6.1, 6.2, and 6.3, we compare the error bounds (6.4) and (6.6) for the Basis Pursuit algorithm. In the tables, we also include $s^*$ computed by (4.2), and $k^* = \lfloor s^*/2 \rfloor$, i.e., the maximal sparsity level such that the sufficient and necessary condition (2.7) holds. The number of measurements is taken as $m = \lfloor \rho n \rfloor$, $\rho = 0.2, 0.3, \ldots, 0.8$. Note the blanks mean that the corresponding bounds are not valid. For the Bernoulli and Gaussian matrices, the RIC bounds work only for $k \le 2$, even with $m = \lfloor 0.8n \rfloor$, while the $\omega_2(A,2k)$ bounds work up until $k = 9$. Both bounds are better for Hadamard matrices. For example, when $m = 0.5n$, the RIC bounds are valid for $k \le 3$, and our bounds hold for $k \le 5$. In all cases for $n = 256$, our bounds are smaller than the RIC bounds.

We next compare the error bounds (6.5) and (6.7) for the Dantzig selector. For the Bernoulli and Gaussian matrices, our bounds work for wider ranges of $(k,m)$ pairs and are tighter in all tested cases. For the Hadamard matrices, the RIC bounds are better, starting from $k \ge 5$ or 6. We expect that this indicates a general trend, namely, when $k$ is relatively small, the $\omega$ based bounds are better, while when $k$ is large, the RIC bounds are tighter. This was suggested by the probabilistic analysis of $\omega$ in Section 5. The reason is that when $k$ is relatively small, both the relaxation $\|x\|_1 \le 2k\|x\|_\infty$ on the sufficient and necessary condition (2.7) and the relaxation $\|\hat{x} - x\|_2 \le \sqrt{2k}\,\|\hat{x} - x\|_\infty$ are sufficiently tight.
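Table 6.1: Comparison of the $\omega_2$ based bounds and the RIC based bounds on the $\ell_2$ norms of the errors of the Basis Pursuit algorithm for a Bernoulli matrix with leading dimension $n = 256$.

| | $m$ | 51 | 77 | 102 | 128 | 154 | 179 | 205 |
|---|---|---|---|---|---|---|---|---|
| | $s^*$ | 4.6 | 6.1 | 7.4 | 9.6 | 12.1 | 15.2 | 19.3 |
| | $k^*$ | 2 | 3 | 3 | 4 | 6 | 7 | 9 |
| $k=1$ | $\omega$ bd | 4.2 | 3.8 | 3.5 | 3.4 | 3.3 | 3.2 | 3.2 |
| | ric bd | | | 23.7 | 16.1 | 13.2 | [?] | [?] |
| $k=2$ | $\omega$ bd | 31.4 | 12.2 | 9.0 | 7.4 | 6.5 | 6.0 | 5.6 |
| | ric bd | | | | | | 72.1 | [?] |
| $k=3$ | $\omega$ bd | | 252.0 | 30.9 | 16.8 | 12.0 | 10.1 | 8.9 |
| | ric bd | | | | | | | |
| $k=4$ | $\omega$ bd | | | | 52.3 | 23.4 | 16.5 | 13.6 |
| | ric bd | | | | | | | |
| $k=5$ | $\omega$ bd | | | | | 57.0 | 28.6 | 20.1 |
| | ric bd | | | | | | | |
| $k=6$ | $\omega$ bd | | | | | 1256.6 | 53.6 | 30.8 |
| | ric bd | | | | | | | |
| $k=7$ | $\omega$ bd | | | | | | 161.6 | 50.6 |
| | ric bd | | | | | | | |
| $k=8$ | $\omega$ bd | | | | | | | 93.1 |
| | ric bd | | | | | | | |
| $k=9$ | $\omega$ bd | | | | | | | 258.7 |
| | ric bd | | | | | | | |

Blank entries are bounds that are not valid; [?] marks an entry that is illegible or missing in the source.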



In Table 6.7 we present the execution times for computing different $\omega$. For random matrices with leading dimension $n = 256$, the algorithm generally takes 1 to 3 minutes to compute either $\omega_2(A, s)$ or $\omega_\infty(A^TA, s)$.

In the last set of experiments, we compute $\omega_2(A, 2k)$ and $\omega_\infty(A^TA, 2k)$ for a Gaussian matrix and a Hadamard matrix, respectively, with leading dimension $n = 512$. The row dimensions of the sensing matrices range over $m = \lfloor \rho n \rfloor$ with $\rho = 0.2, 0.3, \ldots, 0.8$. In Figure 6.1 we compare the $\ell_2$ norm error bounds of the Basis Pursuit using $\omega_2(A,2k)$ and the RIC. The color indicates the values of the error bounds. We remove all bounds that are greater than 50 or are not valid. Hence, all white areas indicate that the bounds corresponding to those $(k,m)$ pairs are too large or not valid. The left sub-figure is based on $\omega_2(A, 2k)$ and the right sub-figure is based on the RIC. We observe that the $\omega_2(A, 2k)$ based bounds apply to a wider range of $(k, m)$ pairs.



In Figure 6.2, we conduct the same experiment as in Figure 6.1 for a Hadamard matrix and the Dantzig selector. We observe that for the Hadamard matrix, the RIC gives better performance bounds. This result coincides with the one we obtained in Table 6.5.

The average time for computing each $\omega_2(A, 2k)$ and $\omega_\infty(A^TA, 2k)$ was around 15 minutes.

7. Conclusions. In this paper, we analyzed the performance of $\ell_1$ sparse signal recovery algorithms using the $\ell_\infty$ norm of the errors as a performance criterion.
Fig. 6.1: $\omega_2(A, 2k)$ based bounds v.s. RIC based bounds on the $\ell_2$ norms of the errors for a Gaussian matrix with leading dimension $n = 512$. Left: $\omega_2(A, 2k)$ based bounds; Right: RIC based bounds.

Fig. 6.2: $\omega_\infty(A^TA, 2k)$ based bounds v.s. RIC based bounds on the $\ell_2$ norms of the errors for a Hadamard matrix with leading dimension $n = 512$. Left: $\omega_\infty(A^TA, 2k)$ based bounds; Right: RIC based bounds.
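Table 6.2: Comparison of the $\omega_2$ based bounds and the RIC based bounds on the $\ell_2$ norms of the errors of the Basis Pursuit algorithm for a Hadamard matrix with leading dimension $n = 256$.

| | $m$ | 51 | 77 | 102 | 128 | 154 | 179 | 205 |
|---|---|---|---|---|---|---|---|---|
| | $s^*$ | 5.4 | 7.1 | 9.1 | 11.4 | 14.0 | 18.4 | 25.3 |
| | $k^*$ | 2 | 3 | 4 | 5 | 6 | 9 | 12 |
| $k=1$ | $\omega$ bd | 3.8 | 3.5 | 3.3 | 3.2 | 3.1 | 3.0 | 3.0 |
| | ric bd | 46.6 | 13.2 | 9.2 | 9.4 | 8.3 | [?] | [?] |
| $k=2$ | $\omega$ bd | 13.7 | 8.4 | 6.7 | 5.9 | 5.4 | 4.9 | 4.6 |
| | ric bd | | | 46.6 | 24.2 | 15.3 | [?] | [?] |
| $k=3$ | $\omega$ bd | | 30.9 | 14.0 | 10.1 | 8.4 | 7.1 | 6.3 |
| | ric bd | | | | 1356.6 | 25.4 | [?] | [?] |
| $k=4$ | $\omega$ bd | | | 47.4 | 18.9 | 13.2 | 9.9 | 8.1 |
| | ric bd | | | | | 40.0 | 14.0 | [?] |
| $k=5$ | $\omega$ bd | | | | 51.5 | 22.6 | 13.8 | 10.3 |
| | ric bd | | | | | | 18.8 | [?] |
| $k=6$ | $\omega$ bd | | | | | 50.8 | 20.1 | 13.1 |
| | ric bd | | | | | | 42.5 | [?] |
| $k=7$ | $\omega$ bd | | | | | | 31.8 | 16.7 |
| | ric bd | | | | | | 94.2 | 19.7 |
| $k=8$ | $\omega$ bd | | | | | | 63.5 | 21.7 |
| | ric bd | | | | | | 1000.0 | 24.6 |
| $k=9$ | $\omega$ bd | | | | | | 449.8 | 29.4 |
| | ric bd | | | | | | | 39.1 |
| $k=10$ | $\omega$ bd | | | | | | | 42.8 |
| | ric bd | | | | | | | 35.6 |
| $k=11$ | $\omega$ bd | | | | | | | 72.7 |
| | ric bd | | | | | | | 134.1 |
| $k=12$ | $\omega$ bd | | | | | | | 195.1 |
| | ric bd | | | | | | | |

Blank entries are bounds that are not valid; [?] marks an entry that is illegible or missing in the source.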



We expressed other popular performance criteria in terms of the $\ell_\infty$ norm. A family of goodness measures of the sensing matrices was defined using optimization procedures. We used these goodness measures to derive upper bounds on the $\ell_\infty$ norms of the reconstruction errors for the Basis Pursuit, the Dantzig selector, and the LASSO estimator. Polynomial-time algorithms with established convergence properties were implemented to efficiently solve the optimization procedures defining the goodness measures. We expect that these goodness measures will be useful in comparing different sensing systems and recovery algorithms, as well as in designing optimal sensing matrices. In future work, we will use these computable performance bounds to optimally design $k$-space sample trajectories for MRI and to optimally design transmitting waveforms for compressive sensing radar.

8. Appendix: Proofs. 



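8.1. Proof of Proposition 3.1. Proof. Suppose $S = \mathrm{supp}(x)$ and $|S| = \|x\|_0 = k$. Define the error vector $h = \hat{x} - x$. For any vector $z \in \mathbb{R}^n$ and any index set $S \subseteq \{1, \ldots, n\}$, we use $z_S \in \mathbb{R}^{|S|}$ to represent the vector whose elements are those of $z$ indicated by $S$.

Table 6.3: Comparison of the $\omega_2$ based bounds and the RIC based bounds on the $\ell_2$ norms of the errors of the Basis Pursuit algorithm for a Gaussian matrix with leading dimension $n = 256$.

| | $m$ | 51 | 77 | 102 | 128 | 154 | 179 | 205 |
|---|---|---|---|---|---|---|---|---|
| | $s^*$ | 4.6 | 6.2 | 8.1 | 9.9 | 12.5 | 15.6 | 20.0 |
| | $k^*$ | 2 | 3 | 4 | 4 | 6 | 7 | 10 |
| $k=1$ | $\omega$ bd | 4.3 | 3.7 | 3.5 | 3.4 | 3.3 | 3.2 | 3.2 |
| | ric bd | | | 26.0 | 14.2 | 10.0 | [?] | [?] |
| $k=2$ | $\omega$ bd | 34.3 | 12.3 | 8.3 | 7.0 | 6.4 | 5.9 | 5.6 |
| | ric bd | | | | | | 47.1 | [?] |
| $k=3$ | $\omega$ bd | | 197.4 | 23.4 | 14.5 | 11.6 | 9.8 | 8.9 |
| | ric bd | | | | | | | |
| $k=4$ | $\omega$ bd | | | 1036.6 | 39.6 | 21.7 | 15.9 | 13.4 |
| | ric bd | | | | | | | |
| $k=5$ | $\omega$ bd | | | | | 49.3 | 26.4 | 20.0 |
| | ric bd | | | | | | | |
| $k=6$ | $\omega$ bd | | | | | 284.2 | 48.8 | 31.2 |
| | ric bd | | | | | | | |
| $k=7$ | $\omega$ bd | | | | | | 129.1 | 48.1 |
| | ric bd | | | | | | | |
| $k=8$ | $\omega$ bd | | | | | | | 185.5 |
| | ric bd | | | | | | | |
| $k=9$ | $\omega$ bd | | | | | | | 9640.3 |
| | ric bd | | | | | | | |

Blank entries are bounds that are not valid; [?] marks an entry that is illegible or missing in the source.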



We first deal with the Basis Pursuit and the Dantzig selector. As observed by Candès in [7], the fact that $\|\hat{x}\|_1 = \|x + h\|_1$ is the minimum among all $z$ satisfying the constraints in (2.2) and (2.3), together with the fact that the true signal $x$ satisfies the constraints as required by the conditions imposed on the noise in Proposition 3.1, imply that $\|h_{S^c}\|_1$ cannot be very large. To see this, note that

$$\begin{aligned} \|x\|_1 &\ge \|x + h\|_1 \\ &= \sum_{i \in S} |x_i + h_i| + \sum_{i \in S^c} |x_i + h_i| \\ &\ge \|x_S\|_1 - \|h_S\|_1 + \|h_{S^c}\|_1 \\ &= \|x\|_1 - \|h_S\|_1 + \|h_{S^c}\|_1. \end{aligned} \tag{8.1}$$

Therefore, we obtain $\|h_S\|_1 \ge \|h_{S^c}\|_1$, which leads to

$$2\|h_S\|_1 \ge \|h_S\|_1 + \|h_{S^c}\|_1 = \|h\|_1. \tag{8.2}$$



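Table 6.4: Comparison of the $\omega_\infty$ based bounds and the RIC based bounds on the $\ell_2$ norms of the errors of the Dantzig selector algorithm for the Bernoulli matrix used in Table 6.1.

| | $m$ | 51 | 77 | 102 | 128 | 154 | 179 | 205 |
|---|---|---|---|---|---|---|---|---|
| | $s^*$ | 4.6 | 6.1 | 7.4 | 9.6 | 12.1 | 15.2 | 19.3 |
| | $k^*$ | 2 | 3 | 3 | 4 | 6 | 7 | 9 |
| $k=1$ | $\omega$ bd | 6.0 | 5.4 | 4.8 | 4.4 | 4.2 | 4.1 | 4.1 |
| | ric bd | | 46.3 | 17.4 | 12.1 | 11.2 | [?] | [?] |
| $k=2$ | $\omega$ bd | 102.8 | 38.4 | 29.0 | 18.5 | 14.1 | 12.8 | 11.9 |
| | ric bd | | | | | | 47.2 | [?] |
| $k=3$ | $\omega$ bd | | 1477.2 | 170.2 | 81.2 | 57.0 | 41.1 | 32.6 |
| | ric bd | | | | | | | |
| $k=4$ | $\omega$ bd | | | | 522.7 | 194.6 | 128.9 | 89.0 |
| | ric bd | | | | | | | |
| $k=5$ | $\omega$ bd | | | | | 768.7 | 323.6 | 203.2 |
| | ric bd | | | | | | | |
| $k=6$ | $\omega$ bd | | | | | 24974.0 | 888.7 | 489.0 |
| | ric bd | | | | | | | |
| $k=7$ | $\omega$ bd | | | | | | 3417.3 | 1006.9 |
| | ric bd | | | | | | | |
| $k=8$ | $\omega$ bd | | | | | | | 2740.0 |
| | ric bd | | | | | | | |
| $k=9$ | $\omega$ bd | | | | | | | 10196.9 |
| | ric bd | | | | | | | |

Blank entries are bounds that are not valid; [?] marks an entry that is illegible or missing in the source.

We now turn to the LASSO estimator (2.4). We use the proof technique in [8] (see also [3]). Since the noise $w$ satisfies $\|A^Tw\|_\infty \le \kappa\mu$ for some small $\kappa > 0$, and $\hat{x}$ is a solution to (2.4), we have

$$\frac{1}{2}\|A\hat{x} - y\|_2^2 + \mu\|\hat{x}\|_1 \le \frac{1}{2}\|Ax - y\|_2^2 + \mu\|x\|_1.$$

Consequently, substituting $y = Ax + w$ yields

$$\begin{aligned} \mu\|\hat{x}\|_1 &\le \frac{1}{2}\|Ax - y\|_2^2 - \frac{1}{2}\|A\hat{x} - y\|_2^2 + \mu\|x\|_1 \\ &= \frac{1}{2}\|w\|_2^2 - \frac{1}{2}\|A(\hat{x} - x) - w\|_2^2 + \mu\|x\|_1 \\ &= \frac{1}{2}\|w\|_2^2 - \frac{1}{2}\|A(\hat{x} - x)\|_2^2 + \langle A(\hat{x} - x), w\rangle - \frac{1}{2}\|w\|_2^2 + \mu\|x\|_1 \\ &\le \langle A(\hat{x} - x), w\rangle + \mu\|x\|_1 \\ &= \langle \hat{x} - x, A^Tw\rangle + \mu\|x\|_1. \end{aligned}$$

Using the Cauchy-Schwarz type inequality, we get

$$\begin{aligned} \mu\|\hat{x}\|_1 &\le \|\hat{x} - x\|_1 \|A^Tw\|_\infty + \mu\|x\|_1 \\ &\le \kappa\mu\|h\|_1 + \mu\|x\|_1, \end{aligned}$$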
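Table 6.5: Comparison of the $\omega_\infty$ based bounds and the RIC based bounds on the $\ell_2$ norms of the errors of the Dantzig selector algorithm for the Hadamard matrix used in Table 6.2.

| | $m$ | 51 | 77 | 102 | 128 | 154 | 179 | 205 |
|---|---|---|---|---|---|---|---|---|
| | $s^*$ | 5.2 | 6.9 | 9.1 | 12.1 | 14.4 | 18.3 | 25.2 |
| | $k^*$ | 2 | 3 | 4 | 6 | 7 | 9 | 12 |
| $k=1$ | $\omega$ bd | 4.8 | 4.0 | 3.8 | 3.4 | 3.4 | 3.2 | 3.1 |
| | ric bd | | 15.6 | 9.3 | 7.0 | [?] | [?] | [?] |
| $k=2$ | $\omega$ bd | 50.9 | 16.2 | 10.1 | 7.1 | 7.0 | 6.1 | 5.3 |
| | ric bd | | | 45.3 | 16.6 | 13.7 | [?] | [?] |
| $k=3$ | $\omega$ bd | | 108.2 | 30.7 | 14.3 | 13.9 | 10.0 | 8.0 |
| | ric bd | | | 1016.4 | 29.9 | 24.9 | [?] | [?] |
| $k=4$ | $\omega$ bd | | | 150.7 | 35.3 | 29.3 | 16.8 | 11.7 |
| | ric bd | | | | 126.4 | 38.7 | 24.2 | [?] |
| $k=5$ | $\omega$ bd | | | | 108.5 | 64.2 | 31.4 | 17.3 |
| | ric bd | | | | | 187.3 | 30.0 | [?] |
| $k=6$ | $\omega$ bd | | | | 3168.9 | 171.5 | 59.7 | 25.3 |
| | ric bd | | | | | 112.0 | 53.1 | [?] |
| $k=7$ | $\omega$ bd | | | | | 1499.5 | 116.3 | 38.8 |
| | ric bd | | | | | 411.7 | 71.3 | 34.7 |
| $k=8$ | $\omega$ bd | | | | | | 265.3 | 61.4 |
| | ric bd | | | | | | 95.4 | 47.6 |
| $k=9$ | $\omega$ bd | | | | | | 2394.0 | 96.0 |
| | ric bd | | | | | | 198.7 | 61.9 |
| $k=10$ | $\omega$ bd | | | | | | | 157.4 |
| | ric bd | | | | | | | 82.9 |
| $k=11$ | $\omega$ bd | | | | | | | 296.4 |
| | ric bd | | | | | | | 130.3 |
| $k=12$ | $\omega$ bd | | | | | | | 898.2 |
| | ric bd | | | | | | | 201.2 |

Blank entries are bounds that are not valid; [?] marks an entry that is illegible or missing in the source.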



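which leads to

$$\|\hat{x}\|_1 \le \kappa\|h\|_1 + \|x\|_1.$$

Therefore, similar to the argument in (8.1), we have

$$\begin{aligned} \|x\|_1 &\ge \|\hat{x}\|_1 - \kappa\|h\|_1 \\ &= \|x + h_{S^c} + h_S\|_1 - \kappa\left(\|h_{S^c} + h_S\|_1\right) \\ &\ge \|x + h_{S^c}\|_1 - \|h_S\|_1 - \kappa\left(\|h_{S^c}\|_1 + \|h_S\|_1\right) \\ &= \|x\|_1 + (1-\kappa)\|h_{S^c}\|_1 - (1+\kappa)\|h_S\|_1, \end{aligned}$$

where $S = \mathrm{supp}(x)$. Consequently, we have

$$\|h_S\|_1 \ge \frac{1-\kappa}{1+\kappa}\,\|h_{S^c}\|_1.$$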
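Table 6.6: Comparison of the $\omega_\infty$ based bounds and the RIC based bounds on the $\ell_2$ norms of the errors of the Dantzig selector algorithm for the Gaussian matrix used in Table 6.3.

| | $m$ | 51 | 77 | 102 | 128 | 154 | 179 | 205 |
|---|---|---|---|---|---|---|---|---|
| | $s^*$ | 4.6 | 6.2 | 8.1 | 9.9 | 12.5 | 15.6 | 20.0 |
| | $k^*$ | 2 | 3 | 4 | 4 | 6 | 7 | 10 |
| $k=1$ | $\omega$ bd | 6.5 | 5.1 | 4.8 | 4.3 | 4.2 | 4.0 | 3.9 |
| | ric bd | | 30.0 | 18.0 | 14.6 | 9.7 | [?] | [?] |
| $k=2$ | $\omega$ bd | 119.4 | 37.8 | 22.5 | 17.6 | 14.1 | 12.7 | 11.4 |
| | ric bd | | | | | 91.5 | 44.4 | [?] |
| $k=3$ | $\omega$ bd | | 1216.7 | 120.7 | 67.3 | 53.6 | 38.7 | 36.4 |
| | ric bd | | | | | | | [?] |
| $k=4$ | $\omega$ bd | | | 4515.9 | 318.2 | 168.4 | 115.8 | 109.0 |
| | ric bd | | | | | | | |
| $k=5$ | $\omega$ bd | | | | | 663.6 | 292.4 | 247.8 |
| | ric bd | | | | | | | |
| $k=6$ | $\omega$ bd | | | | | 5231.4 | 764.3 | 453.5 |
| | ric bd | | | | | | | |
| $k=7$ | $\omega$ bd | | | | | | 2646.4 | 1087.7 |
| | ric bd | | | | | | | |
| $k=8$ | $\omega$ bd | | | | | | | 2450.5 |
| | ric bd | | | | | | | |
| $k=9$ | $\omega$ bd | | | | | | | 6759.0 |
| | ric bd | | | | | | | |

Blank entries are bounds that are not valid; [?] marks an entry that is illegible or missing in the source.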



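Therefore, similar to (8.2), we obtain

$$\begin{aligned} \frac{2}{1-\kappa}\|h_S\|_1 &= \frac{1+\kappa}{1-\kappa}\|h_S\|_1 + \frac{1-\kappa}{1-\kappa}\|h_S\|_1 \\ &\ge \frac{1+\kappa}{1-\kappa}\cdot\frac{1-\kappa}{1+\kappa}\|h_{S^c}\|_1 + \frac{1-\kappa}{1-\kappa}\|h_S\|_1 \\ &= \|h\|_1. \end{aligned} \tag{8.3}$$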



8.2. Proof of Theorem 4.2. Proof.

1. Since in the optimization problem defining $f_{s,i}(\eta)$, the objective function $z_i$ is continuous, and the constraint correspondence

$$C(\eta) : [0,\infty) \twoheadrightarrow \mathbb{R}^n, \qquad \eta \mapsto \{z : \|Qz\|_\diamond \le 1,\ \|z\|_1 \le s\eta\} \tag{8.4}$$

is compact-valued and continuous (both upper and lower hemicontinuous), according to Berge's Maximum Theorem [2], the optimal value function $f_{s,i}(\eta)$ is continuous. The continuity of $f_s(\eta)$ follows from the fact that finite maximization preserves continuity.

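Table 6.7: Time in seconds taken to compute $\omega_2(A, \cdot)$ and $\omega_\infty(A^TA, \cdot)$ for Bernoulli, Hadamard, and Gaussian matrices. Each cell lists the $\omega_2$ time / the $\omega_\infty$ time.

| $k$ | type | $m=51$ | 77 | 102 | 128 | 154 | 179 | 205 |
|---|---|---|---|---|---|---|---|---|
| 1 | Bernoulli | 118 / 75 | 84 / 81 | 133 / 84 | 87 / 65 | 133 / 63 | 174 / 144 | 128 / 151 |
| | Hadamard | 84 / 57 | 82 / 55 | 82 / 58 | 82 / 58 | [?] / 58 | [?] / 58 | [?] / 57 |
| | Gaussian | 82 / 69 | 84 / 65 | 212 / 72 | 106 / 102 | [?] / 81 | [?] / 104 | 104 / 72 |
| 3 | Bernoulli | | 155 / 300 | 96 / 228 | 95 / 190 | 97 / 125 | 97 / 135 | 131 / 196 |
| | Hadamard | | 91 / 84 | 88 / 83 | 87 / 77 | [?] / 92 | [?] / 102 | [?] / 70 |
| | Gaussian | | 134 / 137 | 168 / 142 | 115 / 125 | 95 / 165 | [?] / 145 | 100 / 105 |
| 5 | Bernoulli | | | | | [?] / 156 | 111 / 81 | 97 / 107 |
| | Hadamard | | | | 87 / 75 | 85 / 74 | 85 / 75 | 81 / 75 |
| | Gaussian | | | | | 98 | 105 | 96 / 193 |
| 7 | Bernoulli | | | | | | 164 / 178 | 104 / 85 |
| | Hadamard | | | | | 134 | 82 / 71 | 77 / 65 |
| | Gaussian | | | | | | 106 | 105 / 193 |

[?] marks a value that is illegible in the source; a single number marks a cell in which only one of the two times survives.

2. To show the strict increasing property, suppose $0 < \eta_1 < \eta_2$ and the dual variable $\lambda_2^*$ achieves $f_{s,i}(\eta_2)$ in (4.10). Then we have

$$\begin{aligned} f_{s,i}(\eta_1) &\le s\eta_1\|e_i - Q^T\lambda_2^*\|_\infty + \|\lambda_2^*\|_\diamond^* \\ &< s\eta_2\|e_i - Q^T\lambda_2^*\|_\infty + \|\lambda_2^*\|_\diamond^* \\ &= f_{s,i}(\eta_2). \end{aligned} \tag{8.5}$$

The case for $\eta_1 = 0$ is proved by continuity, and the strict increasing of $f_s(\eta)$ follows immediately.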



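3. The concavity of $f_{s,i}(\eta)$ follows from the dual representation (4.10) and the fact that $f_{s,i}(\eta)$ is the minimization of a function of variables $\eta$ and $\lambda$, and when $\lambda$, the variable to be minimized, is fixed, the function is linear in $\eta$.

4. Next we show that when $\eta > 0$ is sufficiently small $f_s(\eta) \ge s\eta$. Taking $z = s\eta e_i$, we have $\|z\|_1 = s\eta$ and $z_i = s\eta > \eta$ (recall $s \in (1,\infty)$). In addition, when $0 < \eta \le 1/(s\|Q_i\|_\diamond)$, we also have $\|Qz\|_\diamond \le 1$. Therefore, for sufficiently small $\eta$, we have $f_{s,i}(\eta) \ge s\eta > \eta$. Clearly, $f_s(\eta) = \max_i f_{s,i}(\eta) \ge s\eta > \eta$ for such $\eta$.

Recall that

$$\frac{1}{s^*} = \max_i \min_{\lambda_i} \|e_i - Q^T\lambda_i\|_\infty. \tag{8.6}$$

Suppose $\lambda_i^*$ is the optimal solution for each $\min_{\lambda_i} \|e_i - Q^T\lambda_i\|_\infty$. For each $i$, we then have

$$\frac{1}{s^*} \ge \|e_i - Q^T\lambda_i^*\|_\infty, \tag{8.7}$$

which implies

$$\begin{aligned} f_{s,i}(\eta) &= \min_{\lambda_i}\ s\eta\|e_i - Q^T\lambda_i\|_\infty + \|\lambda_i\|_\diamond^* \\ &\le s\eta\|e_i - Q^T\lambda_i^*\|_\infty + \|\lambda_i^*\|_\diamond^* \\ &\le \frac{s}{s^*}\eta + \|\lambda_i^*\|_\diamond^*. \end{aligned} \tag{8.8}$$

As a consequence, we obtain

$$f_s(\eta) = \max_i f_{s,i}(\eta) \le \frac{s}{s^*}\eta + \max_i \|\lambda_i^*\|_\diamond^*. \tag{8.9}$$

Pick $\rho \in (s/s^*, 1)$. Then, we have the following when $\eta > \max_i \|\lambda_i^*\|_\diamond^*/(\rho - s/s^*)$:

$$f_{s,i}(\eta) < \rho\eta,\ i = 1, \ldots, n, \quad \text{and} \quad f_s(\eta) < \rho\eta. \tag{8.10}$$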

5. We first show the existence and uniqueness of the positive fixed points for $f_{s,i}(\eta)$. The properties 1) and 4) imply that $f_{s,i}(\eta)$ has at least one positive fixed point. (Interestingly, 2) and 4) also imply the existence of a positive fixed point, see [28].) To prove uniqueness, suppose there are two fixed points $0 < \eta_1^* < \eta_2^*$. Pick $\eta_0$ small enough such that $f_{s,i}(\eta_0) > \eta_0 > 0$ and $\eta_0 < \eta_1^*$. Then $\eta_1^* = \lambda\eta_0 + (1-\lambda)\eta_2^*$ for some $\lambda \in (0,1)$, which implies that $f_{s,i}(\eta_1^*) \ge \lambda f_{s,i}(\eta_0) + (1-\lambda) f_{s,i}(\eta_2^*) > \lambda\eta_0 + (1-\lambda)\eta_2^* = \eta_1^*$ due to the concavity, contradicting $\eta_1^* = f_{s,i}(\eta_1^*)$.

The set of positive fixed points for $f_s(\eta)$, $\{\eta \in (0,\infty) : \eta = f_s(\eta) = \max_i f_{s,i}(\eta)\}$, is a subset of $\bigcup_{i=1}^n \{\eta \in (0,\infty) : \eta = f_{s,i}(\eta)\} = \{\eta_i^*\}_{i=1}^n$. We argue that

$$\eta^* = \max_i \eta_i^* \tag{8.11}$$

is the unique positive fixed point for $f_s(\eta)$.

We proceed to show that $\eta^*$ is a fixed point of $f_s(\eta)$. Suppose $\eta^*$ is a fixed point of $f_{s,i_0}(\eta)$, then it suffices to show that $f_s(\eta^*) = \max_i f_{s,i}(\eta^*) = f_{s,i_0}(\eta^*)$. If this is not the case, there exists $i_1 \ne i_0$ such that $f_{s,i_1}(\eta^*) > f_{s,i_0}(\eta^*) = \eta^*$. The continuity of $f_{s,i_1}(\eta)$ and the property 4) imply that there exists $\eta > \eta^*$ with $f_{s,i_1}(\eta) = \eta$, contradicting the definition of $\eta^*$.

To show the uniqueness, suppose $\eta_1^*$ is a fixed point of $f_{s,i_1}(\eta)$ satisfying $\eta_1^* < \eta^*$. Then, we must have $f_{s,i_0}(\eta_1^*) > f_{s,i_1}(\eta_1^*)$ because otherwise the continuity implies the existence of another fixed point of $f_{s,i_0}(\eta)$. As a consequence, $f_s(\eta_1^*) > f_{s,i_1}(\eta_1^*) = \eta_1^*$ and $\eta_1^*$ is not a fixed point of $f_s(\eta)$.

6. Next we show $\eta^* = \gamma^* \stackrel{\mathrm{def}}{=} 1/\omega_\diamond(Q,s)$. We first prove $\gamma^* \ge \eta^*$ for the fixed point $\eta^* = f_s(\eta^*)$. Suppose $z^*$ achieves the optimization problem defining $f_s(\eta^*)$, then we have

$$\eta^* = f_s(\eta^*) = \|z^*\|_\infty, \quad \|Qz^*\|_\diamond \le 1, \quad \text{and} \quad \|z^*\|_1 \le s\eta^*. \tag{8.12}$$

Since $\|z^*\|_1/\|z^*\|_\infty \le s\eta^*/\eta^* \le s$, we have

$$\gamma^* \ge \frac{\|z^*\|_\infty}{\|Qz^*\|_\diamond} \ge \eta^*. \tag{8.13}$$

If $\eta^* < \gamma^*$, we define $\eta_0 = (\eta^* + \gamma^*)/2$ and

$$z^c = \operatorname*{argmax}_z\ \frac{s\|z\|_\infty}{\|z\|_1} \quad \text{s.t.} \quad \|Qz\|_\diamond \le 1,\ \|z\|_\infty \ge \eta_0, \tag{8.14}$$

$$\rho = \frac{s\|z^c\|_\infty}{\|z^c\|_1}. \tag{8.15}$$

Suppose $z^{**}$ with $\|Qz^{**}\|_\diamond = 1$ achieves the optimum of the optimization (4.6) defining $\gamma^* = 1/\omega_\diamond(Q,s)$. Clearly, $\|z^{**}\|_\infty = \gamma^* > \eta_0$, which implies $z^{**}$ is a feasible point of the optimization problem (8.14) defining $z^c$ and $\rho$. As a consequence, we have

$$\rho \ge \frac{s\|z^{**}\|_\infty}{\|z^{**}\|_1} \ge 1. \tag{8.16}$$

Fig. 8.1: Illustration of the proof for $\rho > 1$.

Actually we will show that $\rho > 1$. If $\|z^{**}\|_1 < s\|z^{**}\|_\infty$, we are done. If not (i.e., $\|z^{**}\|_1 = s\|z^{**}\|_\infty$), as illustrated in Figure 8.1, we consider $\xi = \frac{\eta_0}{\gamma^*} z^{**}$, which satisfies

$$\|Q\xi\|_\diamond = \frac{\eta_0}{\gamma^*} < 1, \tag{8.17}$$

$$\|\xi\|_\infty = \eta_0, \quad \text{and} \tag{8.18}$$

$$\|\xi\|_1 = s\eta_0. \tag{8.19}$$

To get $\xi^n$ as shown in Figure 8.1, pick the component of $\xi$ with the smallest non-zero absolute value, and scale that component by a small positive constant less than 1. Because $s > 1$, $\xi$ has more than one non-zero component, implying $\|\xi^n\|_\infty$ will remain the same. If the scaling constant is close enough to 1, $\|Q\xi^n\|_\diamond$ will remain less than 1 due to continuity. But the good news is that $\|\xi^n\|_1$ decreases, and hence $\rho \ge s\|\xi^n\|_\infty/\|\xi^n\|_1$ becomes greater than 1.

Now we proceed to obtain a contradiction that $f_s(\eta^*) > \eta^*$. If $\|z^c\|_1 \le s \cdot \eta^*$, then it is a feasible point of

$$\max_z\ \|z\|_\infty \quad \text{s.t.} \quad \|Qz\|_\diamond \le 1,\ \|z\|_1 \le s \cdot \eta^*. \tag{8.20}$$

As a consequence, $f_s(\eta^*) \ge \|z^c\|_\infty \ge \eta_0 > \eta^*$, contradicting the fact that $\eta^*$ is a fixed point, and we are done. If this is not the case, i.e., $\|z^c\|_1 > s \cdot \eta^*$, we define a new point

$$z^n = \tau z^c \tag{8.21}$$

with

$$\tau = \frac{s \cdot \eta^*}{\|z^c\|_1} < 1. \tag{8.22}$$

Note that $z^n$ is a feasible point of the optimization problem defining $f_s(\eta^*)$ since

$$\|Qz^n\|_\diamond = \tau\|Qz^c\|_\diamond < 1, \quad \text{and} \tag{8.23}$$

$$\|z^n\|_1 = \tau\|z^c\|_1 = s \cdot \eta^*. \tag{8.24}$$

Furthermore, we have

$$\|z^n\|_\infty = \tau\|z^c\|_\infty = \rho\eta^*. \tag{8.25}$$

As a consequence, we obtain a contradiction

$$f_s(\eta^*) \ge \rho\eta^* > \eta^*. \tag{8.26}$$

Therefore, for the fixed point $\eta^*$, we have $\eta^* = \gamma^* = 1/\omega_\diamond(Q, s)$.

7. This property simply follows from the continuity, the uniqueness, and property 4).

8. We use contradiction to show the existence of $\rho_1(\epsilon)$ in 8). In view of 4), we need only to show the existence of such a $\rho_1(\epsilon)$ that works for $\eta_L \le \eta \le (1-\epsilon)\eta^*$ where $\eta_L = \sup\{\eta : f_s(\xi) \ge s\xi,\ \forall\, 0 < \xi \le \eta\}$. Suppose otherwise, we then construct sequences $\{\eta^{(k)}\}_{k=1}^\infty \subset [\eta_L, (1-\epsilon)\eta^*]$ and $\{\rho_1^{(k)}\}_{k=1}^\infty \subset (1,\infty)$ with

$$\lim_{k\to\infty} \rho_1^{(k)} = 1, \qquad f_s(\eta^{(k)}) \le \rho_1^{(k)}\eta^{(k)}. \tag{8.27}$$

Due to the compactness of $[\eta_L, (1-\epsilon)\eta^*]$, there must exist a subsequence $\{\eta^{(k_l)}\}_{l=1}^\infty$ of $\{\eta^{(k)}\}$ such that $\lim_{l\to\infty} \eta^{(k_l)} = \eta_{\lim}$ for some $\eta_{\lim} \in [\eta_L, (1-\epsilon)\eta^*]$. As a consequence of the continuity of $f_s(\eta)$, we have

$$f_s(\eta_{\lim}) = \lim_{l\to\infty} f_s(\eta^{(k_l)}) \le \lim_{l\to\infty} \rho_1^{(k_l)}\eta^{(k_l)} = \eta_{\lim}. \tag{8.28}$$

Again due to the continuity of $f_s(\eta)$ and the fact that $f_s(\eta) > \eta$ for $0 < \eta \le \eta_L$, there exists $\eta_c \in [\eta_L, \eta_{\lim}]$ such that

$$f_s(\eta_c) = \eta_c, \tag{8.29}$$

contradicting the uniqueness of the fixed point for $f_s(\eta)$. The existence of $\rho_2(\epsilon)$ can be proved in a similar manner.
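The properties established above (a unique positive fixed point $\eta^*$ of $f_s$, with $\eta^* = 1/\omega_\diamond(Q,s)$) are what make a plain iteration $\eta_{t+1} = f_s(\eta_t)$ usable. The sketch below is our reading of the "naive fixed point iteration" mentioned in Section 6, not the authors' code. It is exercised on the toy case $Q = I$, $\diamond = \infty$, where $f_s(\eta) = \min(1, s\eta)$ in closed form; the helper name `fixed_point` and the tolerance are assumptions.

```python
def fixed_point(f, eta0, tol=1e-10, max_iter=10_000):
    """Iterate eta <- f(eta) from eta0 > 0 until successive values agree."""
    eta = eta0
    for _ in range(max_iter):
        nxt = f(eta)
        if abs(nxt - eta) <= tol * max(1.0, abs(eta)):
            return nxt
        eta = nxt
    raise RuntimeError("fixed point iteration did not converge")


# Toy instance (an assumption for illustration): Q = I and the infinity norm,
# for which max z_i s.t. ||z||_inf <= 1, ||z||_1 <= s*eta equals min(1, s*eta).
s = 3.0
f_identity = lambda eta: min(1.0, s * eta)

eta_star = fixed_point(f_identity, eta0=1e-3)
omega = 1.0 / eta_star  # omega_inf(I, s) recovered from the fixed point
```

In a real computation, `f` would evaluate $\max_i f_{s,i}(\eta)$ by solving $n$ linear or second-order cone programs per step, which is where the running times reported in Table 6.7 come from.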